Improving Non-autoregressive Translation Quality with Pretrained Language Model, Embedding Distillation and Upsampling Strategy for CTC

Authors: Lee Hung-yi, Syu Shen-sian, Xie Juncheng
Publication date: 30/08/2023

Non-autoregressive approaches aim to improve the inference speed of translation models, particularly those that generate output in a one-pass forward manner. However, these approaches often suffer from a significant drop in translation quality compared to autoregressive models. This paper introduces a series of innovative techniques to enhance the translation quality of Non-Autoregressive Translation (NAT) models while maintaining a substantial acceleration in inference speed. We propose fine-tuning Pretrained Multilingual Language Models (PMLMs) with the CTC loss to train NAT models effectively. Furthermore, we adopt the MASK insertion scheme for up-sampling instead of token duplication, and we present an embedding distillation method to further enhance performance. In our experiments, our model outperforms the baseline autoregressive model (Transformer base) on multiple datasets, including WMT'14 DE↔EN, WMT'16 RO↔EN, and IWSLT'14 DE↔EN. Notably, our model achieves better performance than the baseline autoregressive model on the IWSLT'14 En↔De and WMT'16 En↔Ro datasets, even without using distillation data during training. It is worth highlighting that on the IWSLT'14 DE→EN dataset, our model achieves an impressive BLEU score of 39.59, setting a new state-of-the-art performance. Additionally, our model exhibits a remarkable speed improvement of 16.35 times compared to the autoregressive model.

Comment: 12 pages, 6 figures

arXiv.org e-Print Archive
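
The abstract above contrasts two ways of lengthening the input before CTC training: token duplication and MASK insertion. The following is a minimal sketch of that contrast, not the authors' implementation. The token strings `[MASK]` and `_`, the function names, the up-sampling ratio of 2, and the position at which masks are inserted are all assumptions made for illustration; the abstract does not specify them. The `ctc_collapse` helper applies the standard CTC decoding rule (merge consecutive repeats, then drop blanks).

```python
from itertools import groupby

MASK = "[MASK]"  # hypothetical mask token string
BLANK = "_"      # hypothetical CTC blank symbol


def upsample_duplicate(tokens, ratio=2):
    """Token duplication: repeat every source token `ratio` times."""
    return [tok for tok in tokens for _ in range(ratio)]


def upsample_mask_insert(tokens, ratio=2):
    """MASK insertion: follow every source token with `ratio - 1` mask tokens.

    Where the masks go is an assumption; the abstract only names the scheme.
    """
    out = []
    for tok in tokens:
        out.append(tok)
        out.extend([MASK] * (ratio - 1))
    return out


def ctc_collapse(labels):
    """Standard CTC rule: merge consecutive repeats, then remove blanks."""
    merged = [key for key, _ in groupby(labels)]
    return [tok for tok in merged if tok != BLANK]


source = ["guten", "morgen"]
print(upsample_duplicate(source))
print(upsample_mask_insert(source))
print(ctc_collapse(["good", "good", BLANK, "morning", BLANK, BLANK]))
```

Either scheme gives the model more output positions than source tokens, which CTC needs so that blanks and repeats can be collapsed into a shorter target.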

RESEARCH ON IMMERSIVE SCENE DESIGN BASED ON MENTAL EXCLUSION THEORY

Authors: Lu Yage, Mi Gaofeng, Wang Hui, Xie Qiankun, Zhou Juncheng
Publication date: 01/01/2022

HRČAK - Portal of Croatian Scientific and Professional Journals

Second Triangular Hermite Spline Curves and Its Application

Authors: DENG Wenbo, LI Juncheng, LIU Chunying, XIE Chun, YU Xing
Publication venue: Canadian Research & Development Center of Sciences and Cultures
Publication date: 31/07/2012

Abstract: A class of rational square trigonometric splines is presented that shares the properties of the standard cubic Hermite interpolation spline. The given spline can approximate the interpolated curve more closely than the ordinary cubic polynomial spline.

Key words: Hermite spline curve; C2 continuous; Faultage area; Precision

CSCanada.net: E-Journals (Canadian Academy of Oriental and Occidental Culture, Canadian Research & Development Center of Sciences and Cultures)
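
For reference, the ordinary cubic Hermite interpolant that the abstract above uses as its point of comparison can be evaluated as in the sketch below. This is the textbook polynomial form on the unit interval, not the trigonometric spline the paper proposes; the function name `cubic_hermite` and its parameter names are chosen here for illustration.

```python
def cubic_hermite(p0, p1, m0, m1, t):
    """Evaluate the standard cubic Hermite interpolant at t in [0, 1].

    p0, p1: values at the two endpoints
    m0, m1: tangents (derivatives) at the two endpoints
    """
    t2 = t * t
    t3 = t2 * t
    h00 = 2 * t3 - 3 * t2 + 1   # weight of p0
    h10 = t3 - 2 * t2 + t       # weight of m0
    h01 = -2 * t3 + 3 * t2      # weight of p1
    h11 = t3 - t2               # weight of m1
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


# The interpolant passes through both endpoints.
print(cubic_hermite(0.0, 1.0, 0.0, 0.0, 0.0))
print(cubic_hermite(0.0, 1.0, 0.0, 0.0, 1.0))
```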

Frustratingly Easy Model Generalization by Dummy Risk Minimization

Authors: Hu Xixu, Wang Jindong, Wang Juncheng, Wang Shujun, Xie Xing
Publication date: 04/08/2023

Empirical risk minimization (ERM) is a fundamental machine learning paradigm. However, its generalization ability is limited in various tasks. In this paper, we devise Dummy Risk Minimization (DuRM), a frustratingly easy and general technique to improve the generalization of ERM. DuRM is extremely simple to implement: just enlarge the dimension of the output logits and then optimize using standard gradient descent. Moreover, we validate the efficacy of DuRM through both theoretical and empirical analysis. Theoretically, we show that DuRM derives greater variance of the gradient, which facilitates model generalization by observing better flat local minima. Empirically, we conduct evaluations of DuRM across different datasets, modalities, and network architectures on diverse tasks, including conventional classification, semantic segmentation, out-of-distribution generalization, adversarial training, and long-tailed recognition. Results demonstrate that DuRM could consistently improve the performance under all tasks in an almost free-lunch manner. Furthermore, we show that DuRM is compatible with existing generalization techniques and we discuss possible limitations. We hope that DuRM could trigger new interest in the fundamental research on risk minimization.

Comment: Technical report; 22 pages

arXiv.org e-Print Archive
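
The DuRM abstract above describes the method as enlarging the dimension of the output logits and then training as usual. The sketch below illustrates that idea with a plain softmax cross-entropy, under several assumptions not stated in the abstract: the extra "dummy" logits never appear as training labels, prediction takes the argmax over the real classes only, and the names `cross_entropy`, `predict`, and `num_real_classes` are hypothetical. It is not the authors' code.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, via a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def predict(logits, num_real_classes):
    """Argmax over the real classes only (assumed inference rule)."""
    real = logits[:num_real_classes]
    return max(range(num_real_classes), key=real.__getitem__)


# Three real classes followed by two dummy logits (values are made up).
logits = [2.0, 0.5, -1.0, 0.1, 0.3]
num_real_classes = 3

loss_plain = cross_entropy(logits[:num_real_classes], label=0)
loss_with_dummies = cross_entropy(logits, label=0)
print(loss_plain, loss_with_dummies)
print(predict(logits, num_real_classes))
```

Because the dummy logits enter the softmax normalizer, they change the loss and its gradient even though no example is ever labeled with a dummy class.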

Listen to Minority: Encrypted Traffic Classification for Class Imbalance with Contrastive Pre-Training

Authors: Guo Juncheng, Li Xiang, Sang Yafei, Song Qige, Xie Jiang, Zhang Yongzheng, Zhao Shuyuan
Publication date: 06/09/2023

Mobile Internet has profoundly reshaped modern lifestyles in various aspects. Encrypted Traffic Classification (ETC) naturally plays a crucial role in managing mobile Internet, especially with the explosive growth of mobile apps using encrypted communication. Although some existing learning-based ETC methods show promising results, three limitations remain in real-world network environments: 1) label bias caused by traffic class imbalance, 2) traffic homogeneity caused by component sharing, and 3) reliance on sufficient labeled traffic for training. None of the existing ETC methods can address all these limitations. In this paper, we propose a novel Pre-trAining Semi-Supervised ETC framework, dubbed PASS. Our key insight is to resample the original training dataset and perform contrastive pre-training without using individual app labels directly, which avoids label bias caused by class imbalance while obtaining a robust feature representation that differentiates overlapping homogeneous traffic by pulling positive traffic pairs closer and pushing negative pairs away. Meanwhile, PASS designs a semi-supervised optimization strategy based on pseudo-label iteration and dynamic loss weighting algorithms in order to effectively utilize massive unlabeled traffic data and alleviate the manual workload of annotating the training dataset. PASS outperforms state-of-the-art ETC methods and generic sampling approaches on four public datasets with significant class imbalance and traffic homogeneity, improving F1 by 1.31% on Cross-Platform215 and by 9.12% on ISCX-17. Furthermore, we validate the generality of the contrastive pre-training and pseudo-label iteration components of PASS, which can adaptively benefit ETC methods with diverse feature extractors.

Comment: Accepted by 2023 20th Annual IEEE International Conference on Sensing, Communication, and Networking, 9 pages, 6 figures

arXiv.org e-Print Archive
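
The PASS abstract above describes contrastive pre-training as pulling positive traffic pairs closer and pushing negative pairs away. The sketch below shows one generic way to express that objective, an InfoNCE-style loss over cosine similarities. It is a swapped-in, widely used formulation and may differ from the loss PASS actually uses; the function names, the `temperature` value, and the toy vectors are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is most similar to its positive."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    scaled = [s / temperature for s in sims]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_z - scaled[0]


anchor = [1.0, 0.0]
near = [0.9, 0.1]
far = [0.0, 1.0]
opposite = [-1.0, 0.0]

# The loss is small when the positive is near the anchor, large when it is far.
print(info_nce(anchor, near, [far, opposite]))
print(info_nce(anchor, far, [near, opposite]))
```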